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Abstract 

Given  m  points  in  the  plane  and  a  threshold  t.  a  curve  is  defined  to  be  robust  if  at 
least  (  points  lie  on  it.  Efficient  algorithms  for  delecting  robust  curves  are  given;  the 
key  contribution  is  to  use  randomized  sampling.  In  addition,  an  approximate  version 
of  the  problem  is  introduced.  A  geometric  solution  to  this  problem  is  given;  it  too  can 
be  enhanced  by  randomization. 

These  algorithms  are  readily  generalized  to  solve  the  problem  of  robust  curve  detec- 
tion in  a  scene  of  curve  fragments:  given  a  set  of  curve  segments,  a  curve  a  is  defined 
to  be  robust  if  curve  segments  of  total  length  at  least  /  lie  on  a.  Again,  both  an  exact 
and  an  approximate  version  of  the  problem  are  considered. 

The  problems  and  solutions  are  closely  related  to  the  well-investigated  Hough 
Transform  technique. 

1      Introduction 

A  recent  survey  paper  by  Illingworth  ajid  Kittlrr  [IKS8]  refers  to  the  Hough  TrEinsform, 
HT,  as  "...  a  technique  of  almost  unique  promis/'  for  shape  and  motion  anadysis  in  images 
containing  noisy,  missing,  ajid  extraneous  data  but  its  adoption  has  been  slow  due  to  its 
computational  and  storage  complexity  ...",  and  cites  144  papers  written  on  it  by  1988. 

On  the  skepticad  side,  a  few  papers  (e.g.,  [Br83]  and  [GH90])  ofTer  criticism  of  the  HT 
technique.  Most  criticism  is  directed  towards  the  sensitivity  of  the  HT  and  suggests  not 
to  use  it  blithely. 

Wliile  the  present  paper  is  inspired  by  HT.  it  applies  a  notion  of  robust  curves  to 
overcome  the  sensitivity  related  criticism,  while  majntaining  the  power  of  HT.  We  hope 
that  our  algorithms  will  contribute  ideas  to  designers  of  softwetre  systems  for  scene  analysis. 
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Our  meiin  concern  in  this  presentation  is  to  address  computational  (and  storage)  efficiency 
issues. 

Given  m  points  in  the  plane  and  a  threshold  t,  a  curve  is  defined  to  be  robust  if  at 
least  t  points  lie  on  it.  Alternatively,  given  m  line  segments  in  the  plane  and  a  length 
threshold  /,  a  curve  is  defined  to  be  robust  if  Hne  segments  of  total  length  at  least  /  lie 
on  it.  We  give  efficient  algorithms  for  detecting  robust  curves  under  both  definitions  (the 
"RSL  problem").  Section  3  deals  with  problems  where  the  points  (or  line  segments)  lie 
exactly  on  a  curve.  The  main  contribution  is  to  apply  randomization.  In  Section  4,  we 
introduce  a  similar  problem  (the  "e-RSL  problem"),  where  points  (or  line  segments)  are 
counted  even  if  they  only  he  neeu"  a  curve.  A  geometric  solution  which  can  be  enhanced 
by  randomization  is  given. 

A  powerful  algorithmic  methodology  which  is  used  extensively  in  the  present  paper  is 
randomization.  We  use  the  concept  of  randomized  algorithms  in  the  same  spirit  cis  advo- 
cated by  Rabin  [Ra76]:  The  algorithm  "flips  a  coin"  in  order  to  determine  its  next  move. 
We  then  analyze  the  time  and  space  requirements  of  the  algorithm.  Upper  bounds  for 
its  expected  running  time,  or  for  its  running  time  with  high  probability,  axe  given.  To 
avoid  misunderstanding,  we  emphasize  that  our  analysis  does  not  maJce  any  assumptions 
regarding  a  probabilistic  distribution  of  the  input;  all  our  results  hold  for  any  input.  Pre- 
vious use  of  randomization  in  the  context  of  HT  was  proposed  in  [FBS1,FF81,XOK90];  no 
analysis  was  given  in  these  papers.  Our  method  of  randomization  is  different.  Another 
recent  paper,  by  Bergen  and  Shvaytser  [BS90],  also  advocates  the  use  of  randomization  for 
achieving  faster  implementations  of  the  Hough  transform.  One  major  difference  between 
the  paper  by  Bergen  cind  Shvaytser  and  ours  lies  in  the  algorithms;  a  substantial  part  of 
our  paper  is  concerned  with  developing  new  algorithms,  whereas  Bergen  £md  Shvaytser  are 
concerned  with  more  efficient  implementations  of  the  standard  approaches.  Our  algorithms 
axe  more  efficient. 

A  novel  element  in  our  treatment  is  that  its  primary  concern  is  with  algorithmic  effi- 
ciency. There  is  also  a  contribution  in  introducing  a  (provably)  precise  approximation  in 
the  zdgorithm  for  the  e-RSL  problem.  While  the  real  numbers  are  presumably  represent- 
ed in  finite  precision  and  hence  are  discretized,  our  approach  has  the  advantage  that  our 
algorithms  do  not  impose  a  further  coarser  discretization.  "Traditional"  image  processing 
algorithms  (e.g.,  [BS90])  include  explicit  steps  that  chop  off  adl  but  a  certain  number  of 
the  most  significant  bits  of  some  real  values  (either  inputs  or  values  computed  from  the 
inputs).  This  leads  to  (unquantified)  approximate  results.  Actually,  it  is  not  cleax  whether 
these  approximate  results  can  be  quantified,  by  contraist  with  our  work.  (It  should  be 
noted  that  our  algorithms  do  not  concern  themselves  with  the  question  of  coping  with 
the  finite  precision  of  computer  eirithmetic;  it  is  aissumed  that  all  arithmetic  computations 
with  reals  are  of  sufficient  precision  axid  take  consteint  time.) 

We  draw  attention  to  two  aspects  of  the  present  paper: 

1.  Rather  thein  seeking  to  find  the  best  unique  solution  to  a  problem,  as  in  much  of  the 
pubhshed  work  on  algorithms,  our  output  definition  seeks  all  solutions  that  meet  the 


problem  specification;  we  Eire  thereby  reducing  the  seaxch  space  of  possible  solutions 
for  further  processing,  an  approach  common  in  the  computer  vision  and  artificial 
intelligence  communities. 

2.  We  believe  that  the  approximate  problem,  that  is,  the  recognition  of  curves  in  an 
image  coTnpri.'>ing  points,  where  the  image  is  noisy,  is  an  important  practical  problem. 
In  defining  it,  our  treatment  is  less  conservative  than  the  "standard  image  processing 
approach:"  Observing  that  the  Hough  transform  is  a  means  rather  than  a  goal,  we 
avoid  trying  directly  to  give  an  efficient  implementation  of  the  Hough  transform; 
instead,  we  focus  on  the  problem  being  solved  via  the  Hough  transform  and  solve 
that  problem  efficiently. 

We  describe  sequentiaJ  and  parallel  implementations  of  our  algorithms.  We  note  that 
the  parallel  algorithm  for  the  e-RSL  problem  is  quite  different  to  the  serial  algorithm, 
ajid  actually  provides  an  alternative  serial  algorithm.  The  parallel  algorithms  cire  for  the 
CREW  PRAM  model.  This  model  comprises  a  set  of  P  processors  together  with  a  shared 
memory.  Each  processor  is  able  to  access  emy  memory  location  in  constant  time.  However, 
while  concurrent  reads  are  allowed,  concurrent  writes  are  forbidden.  It  is  traditional  to 
state  complexity  results  as  a  {P,T)  pair,  where  P  denotes  the  number  of  processors  used 
and  T  the  parallel  time.  We  prefer  to  use  the  pair  (VF,  T),  where  T  denotes  the  parallel 
time,  as  before,  and  W  denotes  the  work  or  the  number  of  operations  performed;  W  —  PT. 
An  algorithm  that  has  complexity  {W,T)  can  be  implemented  to  run  on  P  processors  in 
parallel  time  0{^  +  T);  the  advantage  of  the  {W,T)  notation  is  that  it  shows  at  a  glance 
the  efficiency  of  the  algorithm  (expressed  by  the  W  term)  emd  the  fastest  parallel  time 
that  can  be  achieved  efficiently  (or  equivalently  the  maximum  amount  of  parallelism  that 
can  be  efficiently  used,  namely  y  processors). 

Our  style  of  presentation  emphasizes  ideas  whose  effectiveness  can  be  proven.  Other 
ideas,  which  seem  useful  but  whose  effectiveness  is  not  proven,  will  typically  be  deferred  to 
comments.  While  the  eilgorithms  we  develop  are  jJl  new,  we  will  note  wherever  components 
can  be  traced  to  algorithmic  paradigms. 

2      Preliminaries 

We  start  by  reviewing  the  concept  of  the  Hough  Treinsform.  See  also  [BB82]  for  details 
and  justifications.  Let  M  =  {(ii,  j/i ),  (xji 2/2)1  ••(^^miym)}  be  a  set  of  m  points  in  the 
pleme.  Observe  that  any  streught  line  in  the  plane  can  be  described  as  a  pair  {r,6)  defined 
as  follows.  For  the  purposes  of  our  definition  eind  in  order  to  distinguish  between  lines 
crossing  the  i-eixis  to  the  left  and  the  right  of  the  origin,  we  define  the  lines  to  be  oriented 
so  that  the  origin  is  to  their  right.  A  line  through  the  origin  is  oriented  in  the  increasing 
y  direction,  r  gives  the  normal  distance  of  the  line  from  the  origin.  The  parameter  6  gives 
the  angle  from  the  oriented  i-axis  to  the  oriented  line,  measured  cinticlockwise.  See  Figure 
1.  6  czm  tcike  any  value  in  the  range  [0,27r]  and  r  is  a  reed  number. 


Figure  1:  Definition  of  r  and  9  for  line  L 


Comment.  The  above  definition  is  not  continuous  at  r  =  0.  In  Section  4,  we  will  want 
a  continuous  mapping.  To  achieve  continuity,  we  map  each  line  to  two  points,  najnely 
(r,  ^)  as  defined  above  and  {—r,6  +  n)  (where  all  angle  computations  are  mod  2:r).  Next, 
we  explain  why  continuity  is  achieved.  Consider  a  line  L  with  a  fixed  orientation  that  is 
ad\'anced  from  the  left  side  of  the  origin  to  the  right  side.  The  two-point  mapping  gives 
the  pair  of  points  (r, ^)  and  {-r,9  -f  tt),  where  r  >  0.  When  the  line  crosses  the  origin, 
this  pair  switches  to  {r,6  +  tt)  and  {-r,9),  respectively,  with  r  >  0  still. 

All  straight  lines  that  contain  the  point  (x.y)  must  satisfy  the  following  equation 


r  =  X  sm  u  —  y  cos  ( 


:i) 


The  plane  with  respect  to  parameters  9  and  r  is  cadled  the  dual  plane.  Applying  equation 
1  separately  to  each  point  in  A/  transforms  M  into  a  set  of  m  sinusoidal  curves  in  the  dual 
plajie.  Consider  two  points  in  M.  They  define  two  curves  in  the  dual  plane.  The  points 
of  the  dual  plane  in  which  these  two  curves  intersect  gives  the  unique  straight  line  in  the 
(primal)  plane  that  includes  the  two  points  of  M.  In  fact,  it  suffices  to  restrict  attention 
to  the  6  interval  [0,7r];  then  two  of  these  sinusoidal  curves,  if  not  identical,  intersect  in  a 
single  point. 

Applications  of  HT  in  image  processing  often  look  for  intersection  points  among  pairs 
of  curves  in  the  ducd  plane.  A  point  in  which  several  curves  intersect  in  the  duaJ  plajie 
is  considered  meaningful  since  it  impUes  that  the  corresponding  straight  line  in  the  plane 
includes  severaJ  points  of  M.  Also,  clusters  of  intersection  points  in  the  dual  plane  are 
considered  meajiingful,  since  they  seem  to  "vote"  for  the  same  "object."  m  points  will 
induce  ©(m^)  intersection  points  in  the  duaJ  plane. 

In  the  present  paper  we  avoid  computing  all  intersection  points  explicitly.  This  will 
enable  our  algorithms  to  perform  <C  m^  operations  (where  <g;  means  smaller  in  order  of 
magnitude)  for  the  problems  we  define.  This  is  the  main  novel  feature  of  our  results.  The 
problems  we  define  emphasize  finding  straight  lines  (or  other  curves)  that  many  points 
in  M  vote  for.  Other  aspects  of  HT,  including  the  fact  that  this  is  a  transform,  are  less 
relex'ajit  for  our  problems. 

Two  probabilistic  propositions 


The  following  two  propositions  will  be  helpful.  Proposition  1  is  a  variant  of  Chernoff 's 
bounds,  which  is  due  to  [AV79]. 

Proposition  1    For  all  n,p,f3  with  0  <  p  <  I,  0  <  f3  <  I, 

[(l-J)npJ     /  N 

J2       (I)  P'n-Pr'  <  expi-3'np/2)  (2) 


.=0       v^-/ 
X:         ll]p'(l-  pT-'  <  exp(-/3^np/3)  (3) 

Proposition  2  will  be  used  for  lower  bounding  the  joint  probability  of  two  events  which 
are  not  necessarily  independent. 

Proposition  2  Let  E\  and  E2  be  any  two  events,  and  suppose  that  Proh(E\)  >  \  —  6  and 
Proh{E2)  >  1  -6.    Then  Prob{Eif]E2)  >l-26. 

3      Robust  Straight  Lines 

Let  M  be  a  set  of  m  points  in  the  plane  and  let  an  integer  t,  1  <  t  <  m,  he  called  a 
threshold.  A  straight  line  that  t,  or  more,  points  lie  on  is  called  robust.  Likewise,  let  S 
be  a  set  of  s  line  segments  in  the  plane  and  let  /  >  0,  a  real  number,  be  called  a  length 
threshold.  A  straight  line  that  contains  segments  of  totaJ  length  at  leeist  /  is  called  robust. 
It  will  always  be  clear  from  the  context  in  which  sense  the  term  robust  is  being  used.  In 
this  section  we  study  the  following  two  problems. 

The  Point  Robust  Straight  Lines  (point-RSL)  Detection  Problem. 

Input.  Set  M  of  points  and  threshold  t. 

Problem.  Find  all  straight  lines  that  are  robust. 

The  Segment  Robust  Straight  Lines  (segment-RSL)  Detection  Problem. 

Input.  Set  M  of  points  emd  threshold  /. 
Problem.  Find  cJl  straight  hnes  that  are  robust. 

The  RSL  problems  are  simpler  than  the  problems  of  Section  4  since  for  each  line,  only 
objects  that  lie  exactly  on  it  are  taken  into  account. 

We  begin  by  considering  the  point-RSL  problem. 

We  say  that  two  points  in  the  plcine  define  the  line  they  lie  on.    Let  U  be  a  subset  of 
M.   The  lines  of  U  axe  those  lines  defined  by  (at  least)  two  points  in  U.   It  is  convenient 
to  solve  the  following  more  general  problem. 
The  Subset-RSL  Problem.  Find  all  the  straight  lines  of  U  that  are  robust. 

Overview  of  the  algorithms  for  the  point-RSL  problem.  Algorithm  1,  below,  pro- 
vides a  framework  for  the  algorithms  in  this  section.  The  implementation  of  the  algorithm 


is  described  in  some  detail.  However,  the  main  concern  in  this  section  is  not  Algorithm  1, 
but  rather  how  to  speed  it  up,  which  is  done  in  Algorithms  2  and  3.  The  main  contribu- 
tions are:  (1)  Using  random  samipling  to  select  the  subset  U.  (2)  Using  only  a  subset  of  M 
to  determine  which  lines  of  U  are  robust.  This  subset  too  is  picked  by  random  sampling. 

3.1      Algorithm  1  -  the  Framework  Algorithm 

Algorithm  1  is  for  the  subset-RSL  problem.  The  point-RSL  problem  itself  is  solved  by 
choosing  U  =  M. 

Algorithm  1 

Step  1:  Find  the  lines  of  set  U  and  store  them  efficiently. 

Implementation  of  Step  1.  The  lines  cire  stored  by  means  of  hashing;  thus,  given  any  line, 
we  caji  check  whether  it  is  also  a  line  of  U  in  0(1)  time.  For  the  parallel  (and  perhaps 
even  the  sericil)  implementation  we  suggest  using  the  randomized  parallel  algorithm  of 
Matieis  and  Vishkin  [MV90];  it  follows  the  non-constructive  scheme  of  Fredman.  Komlos 

and  Szemeredi  [FKS84].     Storing  all   [   '^'    J   linos  takes  0(log|J7|)  time  using  Oi\U\^) 

operations  and  0(|{7p)  space.    The  time  and  operation  counts  are  expected,  while  the 
space  is  in  the  worst  case.  Finding  a  line  will  tcike  0(  1 )  operations,  in  the  worst  case.  An 
alternative  serial  implementation,  with  the  same  serial  complexity  results,  may  rely  on  the 
hashing  algorithm  of  [DKM+88].  This  heishing  cdgorithm  also  follows  [FKS84]. 
Step  2.1:  For  each  line  of  U ,  determine  how  many  points  in  U  lie  on  it. 

This  is  done  as  follows.    For  each  pair  of  points  in  U  increment  a  counter  associated 

with  the  line  they  define.  A  Hne  of  U  that  x  points  of  U  lie  on  it  is  counted  I         I  times. 


A  solution  to  a  quadratic  equation  gives  i.  Serially,  this  takes  0(|L'^|  )  operations. 

Step  2.2:  For  each  line  of  U,  determine  how  many  points  in  M  lie  on  it. 

This  is  done  as  follows.  For  each  point  x  in  ('  and  each  point  y  in  A/  —  U,  determine 
whether  the  straight  line  they  define  is  also  a  line  of  [',  cind  if  so  increment  a  counter 
associated  with  this  line.  Consider  a  line  such  that  h  >  2  points  of  U  and  2  points  of 
M  —  i7  lie  on  it.  Following  the  computation,  the  counter  for  this  line  will  store  the  number 
h  •  i.  So  it  remEiins  to  divide  this  number  by  h.  S«Tiftlly.  this  (and  the  whole  of  Step  2)  will 
take  0{\U\  ■  \M\)  operations. 

Parallel  Im,plem.entation  of  Step  2.  In  pcircillrl.  Stop  2  can  be  performed  in  expected 
O(log  |A/|)  time  and  expected  0(1^71  ■  \M\)  opcration.s.  We  discuss  Step  2.2.  The  parallel 
implementation  of  Step  2.1  is  similar.  The  pairs  of  ]K)ints  x  and  y  axe  processed  in  parallel; 
the  result  of  a  computation  on  i  and  y  is  either  the  line  L  defined  by  U  on  which  x  and 
y  he  or  the  result  "no  line."  These  |?7|(|A/|  —  \l'\)  results  are  sorted  using  the  randomized 
integer  sorting  algorithm  of  Raja^ekaran  and  Reif  [RR891.  Some  Ccire  is  needed  since 
this  sorting  algorithm  tolerates  inputs  only  from  a  restricted  domain;  e.g.,  the  domain  of 


integers  [1  •  •  •  0(|l'|  ■  |Ai^|)]-  This  restriction  is  met  as  follows.  A  property  of  the  hashing 
algorithm  of  Matias-Vishkin  (and,  actually  of  the  Fredman-Komlos-Szemeredi  algorithm) 
is  that  each  straight  line  in  U  will  be  labeled  by  a  separate  integer  between  1  and  5|t"|^. 
So,  it  suffices  to  apply  the  Rajasekaran-Reif  sorting  algorithm  to  these  labels.  Finally,  the 
number  of  occurrences  of  each  line  is  counted  using  the  prefix  sum  algorithm  of  Ladner 
and  Fischer  [LFSO].  These  algorithms  both  perform  a  linear  {0{\U\  ■  |A/|))  number  of 
operations  and  use  O(log  |A/|)  time. 
We  have  shown: 

Theorem  1  Suppose  that  u  lines  of  U  are  robust.  Then  the  above  algorithm  will  find  all 
of  them  m  expected  0{\U\  ■  \M\)  tim,e  using  0{\U\  )  space.  In  addition,  there  is  a  paral- 
lel implementation  of  the  algorithm  which  runs  in  expected  0(log|A/|)  time  and  expected 
0{\U\  ■  |A/|)  operations,  using  0{\U\  )  space. 

Comment.  Hashing  in  combination  with  floating  point  arithmetic  is  problematic.  So  we 
imagine  these  algorithms  being  used  for  integer  or  fixed  precision  coordinates  to  which 
integer  arithmetic  caji  be  applied. 

3.2      The  Improved  Algorithms 

Algorithm  2 

In  order  to  achieve  greater  efficiency,  rather  than  choose  U  —  M,  U  is  selected  by  random 
sampling  in  a  preprocessing  Step  0.  So  let  p  be  a  probability  pEirameter,  p  =  (21nm  +  2)/^ 
(implicitly,  we  axe  assuming  t  >  21nm  +  2). 

Step  0:  Select  each  point  in  M  with  (independent)  probability  p  into  a  set  U.  Note  that 
U  has  expected  size  pm  (which  is  at  least  (21nm  +  2). 

Otherwise,  Algorithm  2  proceeds  as  Algorithm  1. 

The  next  lemma  and  corollary  follow  from  Propositions  1  amd  2,  respectively. 

Lemma  3  Let  A  denote  a  robust  line.  Then,  the  probability  that  at  least  two  points  that 
lie  on  A  were  selected  for  U  (formally,  \A  D  U\  >  2)  is  at  least  1  —  1/m. 

Proof.  We  apply  Equation  2  (from  Proposition  1)  with  0  =  (tp  —  l)/tp.  Since  |A|  >  /, 
Prob{\A  n  {7|  <  1)  is  at  most 

{tp-^tp  tp-2.  ^         ,     ,        ,.,, 

^^P^ (t   )2     ~^  -  ^^P^~ — 9~)  -  exp(-lnm)  <  1/m 

The  lemma  follows.  • 

Corollary  4  Suppose  that  there  are  r  robust  lines.  Then,  all  of  them  will  be  defined  by  U 
with  probability  at  least  1  —  r/m. 


Theorem  2   Suppose  thai  r  lines  of  M  are  robust.    Then: 

1.  Algorithm  2  will  find  all  of  them  with  probability  at  least  1  —  r/m. 

2.  The  algorithm,  runs  m  expected  0{\U\  ■  \M\)  time  using  0{\U\  )  space. 

3.  There  is  a  parallel  implementation  of  the  algorithm  that  runs  m  expected  O(log  \M\) 
time  and  expected  0{\U\  ■  \M\)  operations,  using  0{\U\^)  space. 

Comment.  Larger  values  of  p  result  in  smaller  probabilities  of  failure  at  the  cost  of  a 
larger  running  time.  For  note  that  the  expected  size  of  U  is  a  linear  function  of  p. 

Algorithm  3 

Steps  0,  1  and  2.1  are  as  in  Algorithm  2. 

We  apply  the  sampling  process  in  Step  2.2,  also.  However,  there  is  no  longer  a  guarantee 
of  clearcut  results  for  lines  which  have  a  number  of  points  close  to  the  threshold.  Instead, 
with  high  probability,  the  algorithm  reports  all  hnes  on  which  at  least  t  points  lie;  it  will 
also  report  some  lines  on  which  close  to  t  points  lie.  This  is  made  precise  below.  First,  we 
describe  the  modified  aigorithm. 

A  more  efficient  implementation  of  Step  2.2.  The  idea  is  to  sample  a  subset  M'  of  M 
uniformly  at  random  and  then  estimate  which  lines  of  U  are  robust  based  on  how  many 
points  of  M'  lie  on  them. 

Step  2.2.1*:  Select  each  point  in  M  with  (independent)  probability  q  into  a  set  A/'.  Note 
that  the  expected  size  of  M'  is  9|A/|. 

Step  2.2.2*:  For  each  point  x  in  U  ajid  each  point  y  in  M',  find  whether  the  straight  line 
they  define  is  also  a  line  of  U,  and  if  so  increment  a  counter  associated  with  this  line. 
Serially,  this  will  take  0(|t^|  •  \M'\)  operations.  In  parallel,  this  can  be  done  in  expected 
O(log  \M'\  +  log  |L''|)  time  and  expected  0(1^^  •  |A/'|)  operations,  using  the  method  of  Step 
2.2  above. 

^From  Proposition  1,  we  deduce: 

Lemma  5  Let  A  denote  a  line.  Suppose  some  s  points  m  M'  lie  on  A.  Let  0  and  q  be 
constants  with  0  <  /?  <  1  and  0  <  g  <  1. 

1.  If  A  is  robust,  then  5  >  (1  —  0)tq  with  probability  at  least  1  —  exp{  —  l3^tq/2). 

2.  If  s  >  {\  •\-  0)tq,  then  A  is  robust  with  probability  at  least  1  —  exp{  —  0^tq/3). 

S.   If  s  >  {I  —  0)tq,  then  at  least  (1  —  7t4)'  points  in  M  lie  on  A  with  probability  at  least 
l-exp{-0'\^^tq/3). 

This  lemma  suggests  using  the  threshold  test  of  being  incident  on  at  least  (1  —  0)tq 
points  in  A/'  to  identify  robust  lines.  Clause  (1)  shows  that  a  certificate  of  certainty  Ccui 
be  attached  to  lines  that  exceed  a  threshold  of  (1  +  0)tq  points  and  clause  (3)  indicates 
which  additional  non-robust  lines  may  be  reported. 

In  particular,  choosing  M'  =  U  (ajid  so  p  =  q)  and  using  Proposition  2  we  obtain: 
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Corollary  6   Suppose  that  r  lines  of  M  are  robust.    Then: 

1.  The  above  algorithm  will  find  any  one  line  with  probability  at  least  1  —  exp{  —  0^tp/2). 

2.  It  reports  all  r  robust  lines  with  probability  at  least  1  —  rexp(  — /9^<p/2). 

S.    It  also  reports  any  given  line  with  fewer  than  (1  —  }t|)'  points  with  probability  at 

most  exp(  —  3^^j^tp/3). 
4-    In  addition,  it  reports  no  lines  with  fewer  than  (1  —  Irf  )^  points  with  probability  at 

least  1  -  m^exp{-l3^\^tp/3). 

5.  The  algorithm  runs  in  0{\U\^  +  \M\)  time  using  0{\U\   )  space. 

6.  There  is  a  parallel  implementation  of  the  algorithm  that  runs  m  expected  0(log|Af|) 
time  and  expected  0{\U\'^  +  l-^^l)  operations,  using  0(11^   )  space. 

Remark  1.  To  give  a  more  intuitive  feeling  for  the  meaning  of  Corollary  6,  we  substitute 
some  example  numbers  for  the  parameters.  Suppose  m  =  10^  and  t  =  10''.  Choosing 
p  =  ^  and  /?  =  3  yields  exp(  — /3^ j^<p/3)  <  ^.  This  means  that  any  given  robust  line 
is  missed  with  less  than  5%  probability,  and  any  Hne  which  is  feir  from  robust  (on  which 
fewer  than  (1  —  Y^)t  =  \  points  lie)  is  reported  with  at  most  5%  probabihty.  However, 
as  rri^  is  large,  and  likewise  if  r  is  large,  the  probability  of  some  error  need  not  be  small. 
Nonetheless,  as  we  discuss  is  Remark  4,  below,  the  results  provided  by  the  algorithm  may 
still  be  useful. 

Additionally,  it  is  our  behef  that  bounds  of  Corollary  6  substantially  overstate  the 
probability  of  error.  Several  simulations  were  run  to  confirm  this.  The  following  values 
were  experimented  with:  m  =  2000,  t  =  250  and  p  in  the  range  between  0.025  and  0.15. 
Robust  lines  were  missed  in  only  one  out  of  forty  experiments.  Also,  for  the  smzdler  values 
of  p  in  this  range,  some  non-robust  lines  were  (wrongly)  reported  as  robust.  However,  in 
general  robust  hues  included  more  sample  points  than  non-robust  lines. 
Remark  2.  Another  approach  is  to  filter  the  Unes  reported  using  this  algorithm,  by 
checking  each  line  reported  against  the  full  point  set  M.  This  is  efficient  only  if  there 
are  relatively  few  lines  on  which  close  to  t  points  lie.  Indeed,  we  could  have  a  series  of 
stages  in  which  larger  and  larger  sample  sets  M'  are  used  to  check  the  remaining  doubtful 
lines;  clauses  (1)  and  (2)  of  the  above  lemma  can  be  used  to  exclude  hnes  for  which  the 
result  (robust/not  robust)  is  known  with  high  probabihty.  (We  note  that  in  Algorithm 
1  we  achieved  a  runtime  of  0(|t^|  •  |M|),  because  each  point  of  M  could  lie  on  at  most 
one  line  defined  by  each  point  of  U\  this  yields  a  saving  of  a  multiplicative  factor  of  |t/| 
over  the  runtime  of  the  naive  algorithm,  which  compares  each  point  of  M  against  each  of 
the  |J7p  lines.  We  do  not  achieve  similar  gains  over  the  naive  algorithm  in  verifying  the 
remaining  lines  during  the  filtering  process;  the  verification  algorithm,  in  each  phase,  runs 
in  time  equcd  to  the  minimum  of  Odt'^l  •  \M'\)  and  0{u'  ■  |M'|),  where  u'  is  the  number 
of  lines  at  hand  emd  M'  is  the  point  set  being  considered.)  In  a  peirallel  implementation 
the  following  bounds  can  be  achieved:  the  minimum  of  expected  0(log|M'|)  time  and 
expected  0(|I7|  •  \M'\)  operations  (using  the  method  described  eeirUer)  eind  of  0(log  \M'\) 


time  and  0[u'  ■  \M'\)  operations  (using  the  obvious  naive  cilgorithm  to  compare  each  hne 
with  each  point  of  M'  and  the  prefix  sum  algorithm  of  Ladner  and  Fischer  [LFSO]  to  collate 
the  results). 

Remark  3.  Another  approach  to  the  filtering  problem  is  based  on  point  location  algo- 
rithms [DL76,STS6].  Again,  let  u'  be  the  number  of  lines  reported  as  having  close  to  t 
points  lying  on  them.  The  following  procedure  is  used.  Build  a  point  location  data  struc- 
ture for  these  u'  hnes.  Such  a  data  structure  can  support  the  following  type  of  query:  given 
a  query  point  it  reports  the  line(s)  on  which  the  query  point  lies,  or  which  Hnes  lie  imme- 
diately above  and  below  the  query  point;  further,  answering  a  query  takes  0{\ogu')  time. 
The  data  structure  of  [ST86]  requires  0((u')^)  space  and  can  be  built  in  0((ii')^  log  it') 
time,  while  that  of  [DL76]  requires  0((u')-')  space  and  can  be  built  in  0((u')^  log  u')  time 
(however,  the  latter  data  structure  is  particularly  simple).  Given  this  data  structure,  for 
each  point  of  M  in  turn,  determine  on  which  of  the  u'  lines,  if  any,  it  lies.  For  each  of 
the  u'  lines  maintain  counters  of  how  many  points  of  M  lie  on  them.  The  querying  takes 
0(|-V/|  log  u'  +  iu'Y)  time  overall.  For  each  query  taJces  time  O(log  u')  plus  0(the  number  of 
lines  reported).  But,  as  all  the  points  of  M  axe  distinct,  the  total  number  of  multiple  lines 
reported  in  queries  can  be  at  most  0((u')^),  since  each  intersection  point  of  the  u'  lines 
is  reported  at  most  once  eind  the  totad  number  of  such  points,  even  counted  according  to 
their  multiplicity  (i.e.,  the  number  of  lines  meeting  in  the  intersection  point)  is  0((u')^).  So 
this  approach  yields  a  verification  algorithm  with  running  time  0(  |A/|  log  u'  +  (u')^  log  u'), 
which  is  efficient  if  u'  is  relatively  small.  Next,  a  parallel  implementation  is  described.  It 
uses  the  plajiar  point  location  data  structure  of  Atallah  et  al.  [ACG89],  which  can  be  con- 
structed in  O(logu')  time  and  0{{u')^  \ogu')  operations  and  answers  queries  in  O(logu') 
time.  The  query  is  answered  in  two  phases:  in  the  first  pheise,  the  number,  n,,  of  lines  on 
which  the  query  point,  q,  lies  is  determined;  then,  each  point  q  is  allocated  n,  processors, 
each  processor  being  responsible  for  reporting  a  separate  line  on  which  q  lies.  (This  process 
is  not  described  explicitly  in  [ACG89]  but  the  algorithm  given  there  is  readily  modified 
to  support  this  form  of  response.)  The  set  of  query  point/line  incidences  is  now  sorted 
(eg.  by  Cole's  algorithm  [ColSS]),  using  the  line  label  as  the  key;  next,  using  a  prefix  sum 
algorithm  ([LFSO]),  for  each  line,  the  number  of  query  points  on  the  line  is  determined. 
Overall,  this  requires  0(log  u')  time  and  0{\M'\  log  u'-t-(u')^  log u')  operations.  It  should  be 
noted  that  the  parallel  planar  point  location  algorithm  of  Atailali  et  al.  is  quite  elaborate; 
a  simpler  algorithm,  which  achieves  somewhat  less  parallelism,  is  given  in  [AG86]. 
Remark  4.  The  performance  of  our  cdgorithms  degrades  as  r,  the  number  of  robust  lines, 
increases.  We  feel  that  our  algorithms  can  be  useful  even  if  they  find  only  a  majority  of 
the  lines  that  exist  in  some  scene.  For  we  intend  our  algorithms  to  provide  ideas  which  will 
be  incorporated  within  softwau"e  systems  for  scene  anedysis.  Scene  analysis  can  be  viewed 
as  a  search  problem  over  a  huge  domain.  Even  an  initial  detection  of  a  few  robust  lines 
caji  reduce  drastically  the  complexity  of  the  remaining  problem.  The  remedning  problem 
might  be  dealt  with  in  an  interactive  fcishion  with  human  intervention,  or  automatically  by 
the  software  system  itself.   We  also  note  that  the  point- RSL  problem  taikes  no  advantage 
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of  the  fact  that  an  image  is  being  processed;  presumably  there  is  semantic  information 
to  be  exploited.  This  however,  is  outside  the  problem  domain  we  are  considering.  But  it 
indicates  that  the  role  of  the  point-RSL  algorithm  should  be  to  find  a  substantial  portion 
of  the  robust  lines  in  order  that  further  processing  can  proceed  effectively. 

3.3      The  Segment-RSL  Problem 

This  problem  is  solved  using  algorithm  1  with  two  modifications.  First,  the  set  S  of  line 
segments  replaces  the  lines  of  point  set  U.  Second,  in  Step  2,  for  each  line.  A,  containing 
a  line  segment  of  5,  a  length  counter  is  maintained:  it  accumulates  the  total  length  of  all 
hne  segments  in  S  lying  on  A.  Clearly,  the  algorithm  takes  0(|S|)  time. 

Sampling  is  not  used  here,  but  randomization  is  still  present  through  the  use  of  hash 
functions. 

In  a  parallel  implementation  it  is  not  clear  how  to  carry  out  Step  2  in  0(  |S|)  operations. 
So  we  give  an  implementation  that  performs  0(|S|log|S|)  operations  in  0(log|S|)  time. 
This  implementation  does  not  use  any  randomization.  Cleaxly,  em  analogous  sequentizJ 
algorithm  could  be  given. 

Step  1.  Sort  the  line  segments  of  S  by  slope  and  by  their  intersection  with  the  x-eixis  (this 
ensures  that  segments  lying  on  a  common  line  eu-e  adjacent  in  the  sorted  order). 
Step  2.  Determine  the  length  of  the  segments  on  each  common  line  by  means  of  a  prefix 
sum  algorithm. 

Using  Cole's  sorting  algorithm  [Col88]  for  Step  1  and  the  Ladner-Fisher  prefix  sums 
algorithm  [LF80]  for  Step  2  achieves  the  claiimed  complexity. 

We  have  shown: 

Theorem  3  There  is  a  sequential  algorithm  for  the  segmeni-RSL  problem  which  runs  m 
expected  0{\S\)  time.  Likewise  there  is  a  parallel  algorithm  for  this  problem  which  performs 
0(|5|log|S|)  operations  m  0(log|5|)  time. 

4     e-Robust  Straight  Lines 

We  proceed  to  the  two  other  main  problems  couMdered  in  the  present  paper. 

Let  A/  be  a  set  of  m  points  in  the  plane,  a.s  Ix-fore.  Let  .4  be  any  straight  line  in  the 
plane.  A  point  in  the  plane  is  e-near  to  A  if  its  normal  or  Euclidean  distance  from  A  is  at 
most  c,  for  some  e  >  0.  A  is  called  e-robust  if  at  least  t  points,  for  some  ^  >  0,  axe  e-neai 
to  A. 

The  Point  e-Robust  Straight  Lines  (Point  ?-RSL)  Detection  Problem. 
Input.  Set  M  of  points,  threshold  t  and  f  >  0. 
Problem.  Find  all  straight  Hnes  that  are  e-robust. 
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Likewise,  let  5  be  a  set  of  s  line  segments  in  the  plane.  Let  A  be  any  straight  line  in 
the  plane.   A  line  segment  a  is  €-near  to  A  if  every  point  on  cr  is  f-near  to  ,4.   A  is  called 
e-robust  if  segments  of  total  length  at  least  /  are  e-near  to  A.  Possible  alternate  definitions 
of  the  notion  of  e-robust  for  line  segments  will  be  discussed  later.  Again,  it  will  always  be 
cleeir  from  the  context  in  which  sense  the  term  6-robust  is  being  used. 
The  Segment  e-Robust  Straight  Lines  (Segment  e-RSL)  Detection  problem. 
Input.  Set  S  of  line  segments,  threshold  /  and  e  >  0. 
Problem.  Find  all  straight  lines  that  are  e-robust. 

Where  it  is  clear  from  the  context  that  the  e-RSL  problem  is  being  discussed,  we  omit 
the  prefix  e.  That  is,  we  simply  say  that  a  point  or  line  segment  lies  on  a  straight  line  and 
that  a  line  is  robust. 

The  main  contribution  of  this  section  is  a  so-caJled  geometric  algorithm  for  the  point- 
RSL  problem;  it  reports  all  the  (regions  of)  robust  lines  in  time  0(m'^).  Later,  we  discuss 
how  this  algorithm  cam  be  speeded  up  using  random  sampling  (at  some  cost  in  certainty). 

4.1      The  Geometric  Algorithm  for  the  Point-RSL  problem 

In  the  description  of  the  algorithm  that  follows,  the  region  containing  the  points  e-near 
to  a  line  .4  is  called  the  radius  e  cylinder  with  axis  A.  (Note  that  this  cylinder  is  two 
dimensional.) 

A  robust  line  corresponds  to  a  cylinder  of  radius  e  that  encloses  at  least  /  points:  call 
such  a  cyhnder  a  robust  cylinder.  In  general,  there  may  be  infinitely  many  solutions  to  the 
RSL  problem;  in  the  next  three  paragraphs,  we  discuss  how  these  solutions  are  provided 
as  output  by  the  algorithm. 

Form  of  the  output.  The  output  is  provided  in  the  dual  (r,  ^)  space  described  earlier, 
denoted  D.  Recall  that  each  Hne  is  mapped  to  two  points:  {r,6),  as  defined  in  Section  2, 
and  {—r,6  +  ir). 

The  following  remark  may  help  to  illustrate  the  role  of  this  mapping  in  ensuring  con- 
tinuity. Consider  a  straight  line  with  a  fixed  orientation  that  is  advanced  from  the  left 
side  of  the  origin  to  the  right  side;  as  the  line  is  advanced,  the  mapping  {r,6),  r  >  0,  of 
the  line  changes  by  continuously  reducing  r;  when  the  line  crosses  the  origin,  the  mapping 
becomes  (— r,  d)  with  r  >  0  still,  and  then  a  continuous  increase  of  r  follows. 

The  output  is  provided  as  a  set  of  regions,  which  between  them  contain  the  images 
of  the  cLxes  of  all  the  robust  cylinders.  Each  axis  will  appear  twice:  once  for  each  of  its 
images.  Each  region  is  given  by  providing  its  boundary,  comprising  a  series  of  continuous 
curves,  meeting  at  vertices.  Each  curve  is  specified  by  providing  its  equation.  See  Remark 
6  below  for  a  further  discussion  of  the  output  form. 
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4.1.1      Overview  of  the  Geometric  Algorithm 

An  image  perimeter  will  mean  the  perimeter  of  a  rectangle  that  contains  all  input  points; 
it  will  be  convenient  to  select  a  large  enough  containing  rectangle,  so  that  the  distance 
between  each  point  2Lnd  an  edge  of  the  rectangle  is  larger  than  e.  It  is  also  convenient 
to  choose  the  origin  to  be  the  lower  left  hand  corner  of  this  rectangle  (so  that  all  the 
image  points  have  positive  coordinates).  First,  we  consider  an  anchored  instance  of  the 
RSL  problem,  where  the  cylinder  axis  is  required  to  pass  through  a  fixed  point  on  the 
image  perimeter,  called  the  anchor.  To  find  the  robust  cylinders,  we  rotate  the  axis  of 
the  cylinder  about  the  given  zinchor,  keeping  track  of  the  points  as  they  enter  and  leave 
the  rotating  cylinder.  The  rotationed  order  of  the  point  instances  partitions  the  angular 
interval  [0,11]  into  contiguous  angular  intervals  (AIs),  as  follows.  Each  robust  cylinder 
whose  axis  lies  within  the  SEime  angular  interval  contains  the  same  input  points.  Consider 
an  anticlockwise  rotation  of  a  radius  e  cylinder  about  the  anchor  (the  rotation  corresponds 
to  traversing  the  angular  interval  [0,11]).  For  1  <  ?  <  m,  input  point  p,  enters  the  cyHnder 
at  some  (entering)  angle  e,  and  leaves  the  cylinder  at  some  larger  (leaving)  angle  /,.  It  is 
convenient  to  define  the  entering  boundary  of  the  cylinder  to  be  the  boundary  crossed  by 
p,  at  the  entering  angle  e,;  note  that  the  entering  boundary  is  a  straight  line  that  maps  to 
a  point  in  the  dual  space;  the  leaving  boundary  is  defined  analogously.  We  conclude  that: 
Observation.  There  are  at  most  2m  +  1  angulax  intervals. 

Counting  the  number  of  points  contained  in  the  cylinders  associated  with  each  angular 
interval  is  easy:  associate  a  count  of  +1  with  each  entering  axigle  e,  ajid  —1  with  each  leaving 
angle  /,;  accumulate  the  counts  while  traversing  the  auigulcir  interval  [0,11]).  (The  above 
adgorithm  for  the  anchored  RSL  problem  can  be  implemented  to  run  in  time  O(mlogm); 
we  do  not  detail  it,  since  we  are  not  interested  in  this  special  case.) 

For  the  sequel,  we  need  only  remember  that  the  order  of  the  2m  (entering  and  leaving) 
angles  determined  the  number  of  points  in  each  cylinder  whose  axis  fell  in  a  given  angular 
interval. 

We  axe  ready  to  consider  the  RSL  problem  itself.  In  a  nutshell,  our  algorithm  proceeds 
as  follows:  Maintain  the  solution  for  the  anchored  RSL  problem,  while  advancing  the  anchor 
counterclockwise  around  the  image  perimeter. 

Specifically,  as  the  anchor  is  advanced  around  the  perimeter,  we  show  how  to  maintain 
the  order  of  the  (entering  and  leaving)  angles.  This  order  partitions  the  image  perimeter 
into  contiguous  penmeier  intervals  (PI),  as  follows.  For  each  anchor  in  the  same  perimeter 
interval  the  order  of  the  (entering  and  leaving)  ajigles  is  the  same.  Each  point  on  the 
perimeter,  where  the  order  of  an  entering/leaving  cmgle  and  ajiother  entering/leaving  angle 
changes  is  cedled  a  critical  anchor. 

Next,  we  give  full  characterization  of  critical  anchors.  We  have  two  cases. 
Case  1.   See  Figure  2.    Consider  two  input  points  p,  and  pj  whose  Euclidean  distance  is 
leirger  than  2c.    Suppose  that  the  straight  hne  L,j  defined  by  these  two  points  intersects 
the  image  perimeter  at  points  a  and  6,  and  that  p,  is  closer  to  a  thcin  pj  (implying  that 
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Figure  2:  Critical  anchors  q  and  /? 

Pj  is  closer  to  b  than  p, ).  The  pair  of  points  a  and  h  partitions  the  containing  rectangle 
(whose  perimeter  is  used  for  the  image  perimeter)  into  two  "halves".  Advance  from  u 
to  b  counterclockwise  to  define  the  first  half-perimeter.  We  focus  below  on  the  first  half- 
perimeter.  All  statements  can  be  adapted  to  hold  for  the  second  half-perimeter,  a^  well. 
Consider  the  solution  to  the  anchor  RSL  problem  with  respect  to  anchor  a  and  suppress 
all  entering/leaving  angles  which  relate  to  points  other  than  p,  and  pj.  The  relative  order 
of  the  entering/leaving  emgles  of  points- p,  and  p^  will  be  e,,ej,/_,  and  /,.  There  is  exactly 
one  straight  line  whose  distance  from  £,j  is  e  and  which  intersects  the  first  half-perimeter. 
Its  two  intersection  points  with  this  half-perimeter  (in  counterclockwise  order)  are  denoted 
Q  and  /?. 

Lemma  7   Point  a  is  a  critical  anchor  since  the  order  of  e,  and  e_,  changes  there. 

Lemma  8   Point  (3  is  a  critical  anchor  since  the  order  of  I,  and  Ij  changes  there. 


Co 


:2e 


c, 


6  b 

Figure  3:  Critical  anchors  7  and  S 

See  Figure  3.  Consider  a  radius  e  cylinder  such  that  point  p,  lies  on  one  of  its  boundaries 
and  point  pj  lies  on  its  other  boundary.  There  are  exactly  two  such  cylinders,  denoted  C\ 
and  C2.  The  aais  of  each  of  Cj  and  C2  has  one  intersection  point  with  the  first  half- 
perimeter.  Denote  these  two  intersection  points  (in  counterclockwise  order)  by  -y  and  6. 
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Lemma  9   Point  7  is  a  critical  anchor  since  the  order  of  e,  and  Ij  changes  there. 

Lemma  10   Point  S  is  a  critical  anchor  since  the  order  of  c,  and  Ij  changes  there. 

Case  2.   The  Euclidean  distance  between  two  input  points  p,  and  pj  is  at  most  2e.   The 
only  critical  anchors  on  the  first  half-perimeter  are  the  points  a  and  3.    (Points  -   and  6 
are  undefined.) 
We  conclude: 

Theorem  4    The  pair  of  points  p,  and  Pj  defines  at  most  eight  critical  anchor.-.    The  total 
number  of  critical  anchors  is  at  m.ost  4m(m  —  1). 

Now,  we  discuss  precisely  how  the  robust  cylinders  are  reported.  The  goal  is  to  trace 
the  boundaries  separating  angular  intervals  representing  e  cylinders  containing  t  points 
and  e  cylinders  containing  t  —  I  points.  We  compute  the  images  of  these  boundaries  in 
the  dual  space  D;  they  comprise  the  boundaries  for  the  regions  of  robust  axes  (or  more 
strictly  for  the  images  of  these  regions  in  the  dual  space  D). 

Consider  one  such  boundary  for  a  given  perimeter  interval.  Such  a  boundary  is  defined 
by  an  image  point  p  =  (a,  b)  on  which  "just  robust"  cylinders  are  incident;  specifically,  the 
axis  of  the  corresponding  "just  robust"  cylinder  is  at  distance  e  from  p.  The  image  of  the 
set  of  lines  incident  on  p  (also  called  the  image  of  p)  in  the  dual  space  D  is  the  curve 

r  =  asin^  -  6cos^  (4) 

Two  curves  might  be  expected,  but  in  fact  points  (r,6)  and  {—r,6  +  tt)  belong  to  the  same 
curve. 

Next,  we  determine  the  equations  of  the  e  cylinder  axes  for  which  point  p  is  on  the 
cylinder  boundary.  There  are  two  cases. 
Case  1.  The  aixis  is  above  point  p  on  the  cylinder  boundary.  (See  Figure  4.)       Then  the 


cylinder 
boundary 


equation  is 


cylinder 
axis 


Figure  4:  Deriving  the  equation  for  a  cylinder  axis 

r  —  €  =  a  sin  ^  —  6  cos  6 
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To  see  this  note  that  if  the  cyUnder  boundciry  has  parameters  (r'.  9).  then  the  axis  has 
parameters  (r'  +  e.6}  =  (r.  6):  also  (r',  6)  satisfy  equation  4. 
Case  2.  The  axis  is  below  the  boundary.  Then  the  equation  is 

r  -f-  €  =  a  sin  6  —  b  cos  9 

Point  p  separates  two  angular  interx-als  with  counts  of  t  and  t  —  1.  respectively,  for  some 
portion  of  the  perimeter  delimited  by  two  critical  anchors.  The  e  cylinders  for  which  p  is 
on  the  cylinder  boundary  are  represented  in  the  dual  space  D  by  a  segment  of  the  curves 
defined  in  Cases  1  and  2  above  (due  to  the  output  representation  there  are  actually  two 
segments).  The  endpoints  of  the  segment  are  the  images  in  D  of  the  a^es  through  the 
critical  anchors.  These  endpoints  are  computed  as  the  corresponding  critical  ajichors  are 
traversed. 

The  following  obserN-ation  clarifies  the  way  in  which  the  ducJ  space  is  partitioned. 
Observation.  Suppose  the  image  points  are  in  general  position  (i.e..  no  three  points 
£ire  coUinear  or  on  a  pair  of  parallel  lines  distance  2e  apart).  Then  at  each  endpoint  two 
segments  that  belong  to  the  same  boundary  curve  meet.  There  are  three  ways  this  can 
occur.  The  endpoint  can  terminate  one  segment  and  begin  a  second  one  (with  respect  to 
the  traversal);  the  endpoint  can  terminate  two  segments,  or  the  endpoints  can  begin  two 
segments.  If  the  general  position  assumption  does  not  hold  then  more  than  two  segments 
may  meet  at  a  single  endpoint,  through  collapsing  several  endpoints  into  a  single  endpoint. 

The  remaining  issue  is  to  determine,  in  D.  which  side  of  the  segment  contains  the 
region  of  robust  axes:  but  this  is  readily  done.  If  the  angulcLr  interv-al  with  count  t  begins 
(resp.  ends)  when  point  p  is  encountered  (in  the  anticlockwise  rotation  about  the  anchor) 
then  the  side  of  the  segment  corresponding  to  larger  (resp.  smaller)  \-alues  of  9  is  the  side 
containing  the  robust  axes. 

The  reason  for  describing  the  regions  of  robust  axes  in  the  dual  space  is  that  a  region 
of  lines,  in  our  opinion,  is  more  compact  if  descril>ed  as  a  set  of  points;  if  it  is  intended  to 
present  the  output  on,  for  instzince,  a  terminal,  the  presentation  in  the  dual  space  is  much 
more  immediate  to  the  viewer. 

This  completes  the  overview.  We  remark  that  this  approach  is  based  on  a  solution  to 
the  geometric  retrieval  problem  due  to  Cole  and  ^'ap  [CYS5]. 

Remark  5.  It  is  tempting  to  determine  that  only  one  of  the  two  images  {r.9)  and  (-r,^  + 
tt)  of  a  hne  is  required.  The  difficulty  is  that  wr  would  like  to  obtain  continuous  regions 
of  robust  axes,  which  entails  avoiding  discontinuities  at  r  =  0.  The  problem  is  that  we 
might  not  choose  consistent  mappings  for  different  boundary  segments.  For  a  region  may 
have  holes;  in  this  situation  it  is  clear  that  inconsistent  mappings  for  the  boundaries  of 
the  region  and  its  hole  are  very  undesirable.  (With  sufficient  care,  this  difficulty  may  be 
avoidable:  only  commit  to  the  mapping  at  the  ver>-  end  of  the  computation,  at  which 
point  consistency  of  the  mappings  can  be  ensured.  However,  it  seems  quite  intricate  to 
determine  exactly  what  needs  to  be  done;  also  this  has  a  considerably  more  adhoc  feel  to 
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it.  Further,  one  might  want  to  identify  nearby  regions,  for  example:  it  is  not  clear  that 
this  more  intricate  approach  would  lend  itself  to  such  a  query.) 

4.1.2      Implementing  the  Algorithm 

We  give  two  implementations;  in  this  section  we  describe  a  method  that  requires  0{m'^  log  m ) 
time.  In  Section  4.1.4  we  give  an  alternative  method  that  achieves  O(m^)  time.  In  Section 
4.1.3,  we  describe  a  parallel  algorithm  that  performs  O(m^logm)  operations  in  Ollogm) 
time. 

Step  1.  Find  all  critical  anchors.  This  is  straightforward;  it  can  be  done  sequentially  in 
0{m^)  time  emd  in  parallel  in  time  0(1)  using  Oirn^)  operations  (one  processor  is  assigned 
to  each  pair  of  points). 

Step  2.  Sort  the  critical  anchors  (along  the  image  perimeter).  Sequentially,  this  takes 
O(m^logm)  time;  in  parallel,  using  Cole's  algorithm  [ColSS],  this  takes  0(m^  log  m)  oper- 
ations and  0{\ogm)  time. 

Step  S.  Ad\-ance  the  anchor  around  the  perimeter  in  anticlockwise  order,  determining 
the  boundaries  of  the  regions  of  robust  axes,  as  follows.  A  data-structure  that  supports 
dictionary  operations  (i.e.,  FiND,  I.NSERT  and  DELETE)  is  used  to  maintain  the  order  of 
the  angles  relative  to  the  current  critical  anchor;  it  is  also  used  to  maintain  the  cumulative 
counts.  (For  a  sequential  implementation,  one  simple  and  efficient  data  structure  is  the 
splay  tree  [ST85].)  The  order  at  the  next  criticcd  anchor  will  differ  only  by  the  interchange 
of  two  angles  (entering  and  leaving  angles,  that  is).  We  need  to  keep  track  of  the  boundary 
between  angular  inter\'als  whose  cylinders  contain  t  and  ^  —  1  points,  respectively.  As.  for 
each  leaving/entering  eingle,  we  associate  the  angular  interval  ahead  of  it  in  anticlockwise 
order,  we  Eire  seeking  entering  angles  with  a  count  of  t  and  leaving  angles  with  a  count  of 
t  —  1.  But  maintaining  the  counts  is  straightforward  since  the  only  count  that  changes  is 
the  one  for  the  angular  interval  between  the  two  angles  that  are  swapped. 

The  cylinder  axis  associated  with  an  entering  angle  e  with  count  t  determines  a  point 
on  the  boundary  of  a  region  of  robust  axes  in  the  dual  space  D  (due  to  the  output  repre- 
sentation, there  are  actually  two  points).  The  cylinder  axis  is  determined  by  some  input 
point  p.  sustending  the  cingle  e;  the  image  of  p  in  D  is  a  curve;  a  point  on  the  translation 
by  €  of  this  curve  is  the  image  of  the  cylinder  axis.  As  the  anchor  is  ad\-anced.  the  image 
in  D  of  the  cylinder  axis  comprises  a  sequence  of  points  which  form  a  segment  of  the 
following  curve:  the  translation  by  e  of  the  image  of  p.  The  criticed  anchor  at  which  the 
count  of  entering  angle  e  chemges  determines,  in  D,  an  endpoint  of  the  segment  defined 
by  p  (more  precisely,  this  endpoint  is  the  image  in  D  of  the  cyhnder  axis  for  the  following 
cylinder:  p  is  on  its  entering  boundary  cind  its  axis  passes  through  the  critical  anchor). 
Similar  correspondences  apply  to  the  leaving  eingles.  So  for  each  entering  (resp.  leaving) 
emgle  with  a  count  of  t  (resp.  /  —  1)  we  keep  track  of  the  corresponding  segment  in  the 
ducd  space,  and  update  this  information  as  critical  cmchors  are  crossed.  At  the  end  of  the 
tour  of  the  image  perimeter,  complete  boundeiries  for  each  region  of  robust  axes  will  have 
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been  specified. 

Implementation  Comment.  Each  line  that  intersects  the  perimeter  of  the  containing 
rectangle  must  intersect  it  at  two  points.  The  line  will  be  reported  twice  (once  through 
each  of  these  points).  To  avoid  this,  we  break  the  processing  of  the  perimeter  into  four 
cases,  as  follows:  (Case  1)  For  the  upper  edge  of  the  perimeter  —  all  its  intersecting  lines 
are  considered.  (2)  For  the  left  edge  of  the  perimeter  —  only  those  intersecting  lines  that 
do  not  intersect  the  upper  edge  are  considered.  (3)  For  the  lower  edge  of  the  perimeter  — 
only  those  intersecting  lines  that  do  not  intersect  the  upper  and  left  edge  are  considered. 
(4)  For  the  right  edge  of  the  perimeter  —  none  of  its  intersecting  lines  are  considered.  It 
should  be  clear  that  each  such  line  will  be  considered  exactly  once. 

4.1.3      A  Parallel  Algorithm 

To  obtain  m-way  parallelism,  carry  out  m  instances  of  the  above  sequential  algorithm. 
starting  at  every  4(m  —  l)th  critical  anchor. 

Considerably  more  efficient  pEirallelism  can  be  achieved,  but  a  little  less  easily.  Each 
segment  delimiting  a  region  of  robust  axes  in  D  is  defined  by  its  equation  and  endpoints. 
A  segment  equation  corresponds  to  a  displacement  by  e  of  the  mapping  in  D  of  a  point 
of  the  image  (as  described  earlier).  An  endpoint  of  the  segment  is  obtained  from  the 
corresponding  critical  anchor  as  follows:  let  L  be  the  cylinder  boundary  incident  on  point 
p  at  the  critical  anchor;  the  mapping  of  L  in  D  (a  point)  displaced  by  e  provides  the 
endpoint  (the  displacement  is  identicad  to  that  used  for  the  mapping  of  a  point  in  the 
previous  sentence).  Thus  there  are  two  steps  to  computing  the  segments  delimiting  the 
regions  of  robust  axes.  (1)  Determine  each  dehmiting  segment.  (2)  Connect  segments 
having  common  endpoints.  We  detail  each  in  turn. 

Step  1.  For  each  point  p,,  1  <  J  <  m,  the  segments  it  determines  are  computed  separately. 
Let  S,  be  the  set  of  critical  ajichors  defined  by  p,;  there  cire  at  most  4(m  —  1)  such  anchors. 
Let  e,  and  /,  be  the  entering  and  leaving  angles  sustended  by  p,  (as  the  anchor  advances 
around  the  perimeter).  We  need  to  determine  the  perimeter  intervals  in  which  angle  e, 
(resp.  /,)  hcis  an  associated  count  of  t  (resp.  t  —  1),  for  these  inter\'als  determine  the  "just 
robust"  cylinders  which  in  turn  determine  the  segments  delimiting  the  regions  of  robust 
axes  in  the  duaJ  space  D. 

The  counts  eissociated  with  entering  angle  e,  cind  leaving  angle  /,  are  constant  on  the 
intervals  induced  by  the  critical  anchors  in  S,.  So  it  suffices  to  compute  each  set  S,  in 
parallel,  amd  for  each  set  5,,  to  compute  the  counts  associated  with  e,  and  /,.  Recall  that 
the  count  associated  with  a  given  entering  (or  leaving)  angle  e,  is  altered  only  when  a 
chajige  occurs  to  the  position  of  e,  in  the  ordering  of  entering  and  leaving  angles.  These 
changes  occur  only  at  critical  anchors.  For  instance,  if  in  the  perimeter  interval  immediately 
prior  to  a  critical  anchor,  in  the  sorted  ordering  of  the  angles,  e,  is  followed  by  another 
entering  angle  Cj,  and  if  at  the  critical  angle,  e,  and  Cj  interchange  positions,  then  e,'s 
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count  increases  by  1  in  the  next  perimeter  interval  (the  interval  after  the  critical  anchor 
in  question);  similar  rules  apply  in  the  other  cases.  Thus,  Step  1  is  imi)lemented  in  three 
substcps,  as  follows. 

Step  1,1.  For  every  i  find  the  count  of  c,  (and  /,)  at  some  initial  anchor  location  (there 
will  be  a  single  initial  anchor  location  for  all  e,  and  /, ).  This  is  done  using  the  anchored 
RSL  algorithm  (see  the  beginning  of  section  4.1.1),  followed  by  a  prefi.x  sum  algorithm 
[LFSO].  The  anchored  RSL  algorithm  takes  (9(logm)  time  and  0(m  log  777)  operations, 
and  the  prefix  sum  algorithm  O(logm)  time  and  0{m)  operations. 

Step  1.2.  For  each  e,  (ajid  /, )  separately,  and  in  parallel,  compute  a  sorted  listing  of 
the  changes  to  the  counts  associated  with  e,.  Locations  of  changes  are  obtained  from  the 
sorted  order  of  the  (at  most)  2(m  —  1)  critical  cmchors  associated  with  c,.  Using  Cole's 
sorting  algorithm  this  takes  O(logm)  time  and  0{m\ogm)  operations.  Xcxt,  a  prefix 
sum  algorithm  (and  the  count  of  e,  at  the  initial  anchor  location)  is  used  to  determine 
all  the  counts  for  e,.  This  prefix  sum  algorithm  runs  in  O(logm)  time  performing  0(m) 
operations.  Since  Step  1.2  is  performed  for  each  e,  and  /,  it  takes  a  total  of  0{7n'  logm) 
operations  and  O(logm)  time. 

Step  1.3.  For  each  angle  e,  (resp.  /,),  relative  to  5,,  do  the  following:  each  time  a  count 
of  t  (resp.  <  —  1)  is  found,  determine  the  corresponding  delimiting  segment  for  a  region  of 
robust  axes  -  it  is  a  translation  by  e  of  a  segment  of  the  line  which  is  the  image  in  D  of  the 
point  p  sustending  angle  e,.  The  endpoints  of  the  segment  are  critical  anchors;  specifically 
they  occur  at  critical  anchors  where  the  count  cissociated  with  e,  changes;  for  each  change 
in  count  we  need  to  record  the  critical  anchor  and  the  two  points  that  induce  the  critical 
anchor.  The  computation  of  the  boundaries  of  the  regions  of  robust  axes  given  the  counts 
takes  0(1)  time  and  0{m)  operations,  or  a  total  of  O(m^)  operations  and  0(1)  time  for 
all  e,  and  /,.  Overall,  Step  1  takes  O(logm)  time  using  O(m^logm)  operations. 
Step  2.  Make  two  copies  of  each  segment,  each  copy  tagged  by  one  endpoint  (or  the  cor- 
responding critical  anchor).  Sort  the  copies  of  the  segments  using  the  tag  as  the  key,  using 
Cole's  sorting  aJgorithm,  for  instance.  Segments  that  are  to  be  connected  will  be  adjacent 
in  the  sorted  order.  So  Step  2  also  takes  O(logm)  time  using  0(m^  log  7n)  operations. 
Comment.  The  statement  of  Step  2  above  ignores  the  (degenerate)  ca^e  where  a  critical 
anchor  is  induced  by  more  than  one  pair  of  points.  To  cope  with  this  case,  we  refine  the 
tags  associated  with  each  copy  of  each  segment  to  be  pairs,  as  follows.  The  first  item  in 
a  pair  is  the  critical  anchor,  as  before;  the  second  item  is  the  angle  in  which  two  points 
induce  the  anchor.  The  previous  linear  order  on  tags  is  now  refined  to  a  lexicographic  order 
on  these  pairs.  Sorting  is  applied  again,  but  this  time  with  respect  to  the  lexicographic 
order.  Another  degenerate  ceise  arises  when  more  than  two  segments  meet  at  an  endpoint. 
Then  we  need  to  create  em  adjacency  list  for  each  endpoint.  Segments  that  have  a  common 
enpdoint  will  all  be  contiguous  in  the  sorted  order  provided  by  step  2;  this  set  of  contigu- 
ous segments  can  provide  the  adjacency  list.  (The  sorted  order  is  readily  partitioned  into 
adjacency  lists  by  cm  application  of  a  prefix  sums  algorithm;  this  takes  0(777*)  operations 
and  O(logm)  time.)  In  addition,  in  order  to  facilitate  recognition  of  regions,  it  is  helpful  to 
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have  the  segments  in  each  adjacency  Hst  angularly  sorted  about  their  common  endpoint. 
This  sort  is  readily  carried  out  with  a  further  application  of  Cole's  sorting  algorithm. 

So  far  we  have  shown: 

Theorem  5  There  is  an  algorithm,  to  determine  the  axes  of  the  e  robust  cylinders  for 
a  -point  set  of  size  m;  these  axes  are  reported  in  the  form  of  regions  m  the  dual  plane. 
Sequentially,  the  algorithm,  runs  in  0(m^  log  m)  time.  A  parallel  implementation  performs 
O(m'^logm)  operations  in  O(logTn)  time. 

4.1.4      A  more  efficient  algorithm 

We  now  give  a  more  efficient  sequential  algorithm:  it  uses  the  approach  of  the  parallel 
algorithm  above.  The  key  step  in  the  parallel  algorithm  is  the  sorting  of  critical  angles 
associated  with  each  point.  We  show  how  to  do  this  in  0{m')  time.  It  is  readily  checked 
that  the  remainder  of  the  parallel  algorithm  requires  only  0{m')  operations;  so  it  can  be 
carried  out  sequentially  in  the  sajne  time  bound. 

We  will  be  using  the  topological  sweep  method  of  Edelsbrunner  and  Guibas  [EGS6]. 
This  procedure,  given  a  collection  of  n. straight  lines,  determines  the  arrangement  of  the. 
lines,  and  so  in  particular,  for  each  line,  determines  the  order  of  its  intersections  with  the 
other  lines;  this  computation  takes  time  0{ri^).  Guibas  cind  Edelsbrunner  describe  their 
method  for  straight  lines,  but  the  method  can  be  applied  without  modification  to  other 
classes  of  lines;  of  course,  it  is  necessary  to  verify  that  the  time  bounds  still  hold.  We  will 
apply  the  Guibas  and  Edelsbrunner  method  to  2m  sinusoidal  curves,  r  =  asin^  +  bcosd, 
with  9  in  the  range  [0,2t].  We  claim  that  the  method  still  requires  0{m^)  time.  The 
key  to  the  complexity  bound  in  [EGS6]  is  the  following  lemma.  (This  explanation  is  not 
self  contained;  the  interested  reader  is  referred  to  [EGS6].)  The  following  definition  is 
helpful:  given  a  region  whose  boundary  consists  of  line  segments,  its  size  is  the  numbers 
such  segments. 

Lemma  11  In  an  arrangement  A{H)  of  n  straight  lines,  the  total  size  of  all  the  regions 
whose  boundary  contains  a  segment  of  line  I  is  0{n). 

Proof.  (This  is  a  folklore  proof;  [EG86]  cite  two  other  proofs,  namely  [CGLSo.EOSSG].) 
Consider  the  portions  of  the  lines  below  line  /,  and  the  regions  they  define  bordering  /. 
Name  these  lines  1.2,  •  •  •  ,n  —  1.  Traverse  the  regions  bordering  /,  proceeding  along  /  from 
one  end  to  the  other.  As  each  region  is  traversed,  list  the  lines  encountered  in  order.  The 
resulting  sequence  is  a  list  of  line  names.  Since  each  pair  of  lines  intersects  at  most  once, 
there  is  no  subsequence  of  the  form  abab.  (The  sequence  aba  can  occur  for  the  lines  are 
only  finite  ajid  so  a  can  be  nearer  to  /  ajid  then  b  casi  become  nearer  and  then  end  at 
which  point  a  becomes  nearer  anew.)  It  follows  that  the  sequence  is  a  Davenport-Schinzel 
sequence  of  type  5  =  1,  which  has  length  at  most  2n  —  1  [HSS6].  The  same  argument 
applies  to  the  regions  above  /.  • 
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Proving  the  following  generalization  will  ensure  that  our  use  of  the  topological  sweep 

method  will  take  O(m^)  time  also. 

Lemma  12  In  an  arrangement  A' H )  of  n  sinusoidal  curves,  m  the  range  [0.~].  the  total 
size  of  all  the  regions  bordering  on  one  of  the  sinusoidal  curves  I  (having  as  a  side  part  of 
I)  IS  0(n). 

Proof.  The  proof  of  Lemma  11  follows  from  the  fact  that  each  pair  of  lines  intersects 
at  most  once.  This  is  cdso  true  for  sinusoidal  curves  in  the  range  [O.-j.  So  the  proof  of 
Lemma  11  can  be  applied.  • 

It  remains  to  describe  how  to  sort  the  critical  anchors  for  some  point  p.  For  each  point 
p  we  create  two  curves  in  the  dual  space  D.  The  first  curve  represents  the  axes  of  cyhnders 
for  which  p  is  on  the  entering  boundcLrj-  cind  the  second  curve  the  axes  of  cylinders  for  which 
p  is  on  the  leaving  boundary.  If  p  =  (a.  6),  these  are  the  curves  r  ±  e  =  asin^  —  6cos^. 
Each  critical  anchor  is  associated  with  one  intersection  and  all  intersections  correspond  to 
critical  anchors.  Further,  for  each  curve,  the  ordering  of  intersection  points  along  the  curve 
is  exactly  the  rotational  order  of  the  corresponding  critical  anchors.  The  topological  sweep 
method  is  used  to  obtain  the  ordering  of  intersection  points  along  each  curve.  This  \"ields. 
for  each  point  p.  two  sorted  sets  of  critical  anchors:  merging  these  two  sets  provides  the 
sorted  ordering  of  p's  critical  amchors.  Clearly,  this  takes  Oi'm} )  for  all  m  image  points. 

We  have  shown: 

Theorem  6  There  is  an  algorithm  to  determine  the  axes  of  the  e  robust  cylinders  for  a 
point  set  of  size  m;  these  axes  are  reported  m  the  form  of  regions  m  the  dual  plant.  The 
algorithm  r^ns  m  Oirn^)  time. 

An  interesting  question  is  whether  a  parallel  version  of  this  implementation  can  be 
found.  Indeed,  there  is  a  pairEillel  algorithm  for  computing  the  arrangement  of  a  set  of  n 
straight  lines  in  O(logn)  time  using  0(n'^)  operations  [Good90].  However,  it  is  not  clear 
whether  this  algorithm  extends  to  other  classes  of  curves,  such  as  sinusoidal  curves. 
Remark  6.  Clearly,  the  output  can  be  visualized  in  the  dual  space.  In  addition,  it  may 
be  desirable  to  provide  a  hst  of  the  robust  lines  m  a  discretized  version  of  the  dual  space. 
That  is.  in  the  dual  space  a  grid  with  separation  <*.  for  some  6  >  0  is  used;  each  grid  point 
falling  within  one  of  the  regions  of  robust  axes  is  rrjK)rted  (or  perhaps  different  separations 
<5i  and  62  in  the  6  emd  r  directions  are  more  appropnate). 

4.2      Applying  Sampling 

As  in  Section  3.  we  assume  that  t  >  2Inm  -i-  2.  .Again,  we  select  each  point  in  M  with 
probabihtyp  >  (21n  m-r2)/t  into  a  set  S.  The  following  lemma  provides  some  obser\-ations. 
Description  of  an  algorithm  follows. 
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Lemma  13  Let  A  denote  a  line.  Suppose  some  r  points  in  S  are  e-near  to  A.  For  any  3 
and  p,  0  <  3  <  I  and  0  <  p  <  1; 

1.  If  A  IS  robust,  r  >  (1  —  3)ip  with  probability  at  least  1  —  exp{—3^tp/2). 

2.  If  r  >  (I  +  '3)tp,  then  A  is  robust  with  probability  at  least  1  —  exp(  —  ^^^^p/3). 

S.   If  r  >  (\  —  3)tp,  then  at  least  ~^t  points  in  M  are  e-near  to  A  with  probability  at 
leastl-expi-0'\^tp/3). 

An  algorithm  that  uses  sampling:  We  apply  the  geometric  algorithm  already  described  in 
this  section  to  the  sample  5,  using  (1  —  0)tp  as  the  threshold. 

In  order  to  apply  Lemma  13  to  the  algorithm  we  need  to  bound  the  number  of  lines  .4 
that  need  to  be  considered.  We  show  a  bound  of  S|A/|^  lines. 

Definition  A  cylinder  class  is  a  collection  of  cylinders  all  of  which  have  the  same  point(s) 
on  their  boundaries  and  the  same  points  in  their  interior. 

Lemma  14    There  are  at  most  8|A/p  cylinder  classes. 

Proof.  By  Theorem  3  there  are  at  most  2|Mp  cylinders  with  two  (or  more)  points  on 
their  boundaries  (for  there  are  exactly  4|M|(|A/|  —  1)  critical  anchors  each  defining  such  a 
cylinder;  but  each  cylinder  is  counted  twice).  Next  we  consider  cylinder  classes  that  have 
only  one  point  on  their  boundary:  for  each  such  cylinder,  rotate  it  anticlockwise  about  its 
boundary  point  until  a  second  point  is  on  its  boundary.  Since  there  are  at  most  2|A/p 
cylinders  with  two  points  on  boundary,  there  axe  at  most  4|A/p  classes  of  cylinders  with 
at  lecist  one  point  on  their  boundary.  Finally,  we  consider  cylinder  classes  that  have  no 
points  on  their  boundciry;  for  each  such  cylinder,  translate  it  upwards  until  a  first  point 
is  on  its  boundary.  Since  there  are  at  most  4|A/p  cylinders  with  at  least  point  on  their 
boundary,  there  are  at  most  8|A/p  classes  of  cyhnders.  • 

Now,  we  apply  Lemma  13(3)  to  each  cla^s  of  cylinders  ajid  then  use  Proposition  2  to 
obtain:  with  probability  at  least  1  —  8|A/pexp(  — /J^yrf^p/S),  the  regions  reported  by  the 
algorithm  contain  only  axes  of  cylinders  holding  at  leeist  (1  —  \^)t  points.  Of  course,  as 
noted  earlier,  in  Remark  1,  this  is  likely  to  be  a  rather  small  probability.  Still,  as  before, 
we  expect  to  report  most  of  the  regions  of  robust  hnes,  and  possibly  some  other  regions  in 
addition.  Again,  a  postprocessing  phase  is  called  for  to  filter  the  results  computed  here. 

The  sampling  reduces  the  runtime  of  the  previous  algorithm  to  0(|A/|  +  |5|^log  \S\), 
but  at  the  loss  of  some  certainty  regarding  finding  the  complete  regions  of  robust  cylinders. 

4.3      The  Segment-RSL  Problem 

In  turn,  we  describe  a  sequential  and  a  parcdlel  algorithm.  Both  perform  0(|5plog|S|) 
operations,  though  in  practice  we  suspect  they  would  be  considerably  more  efficient;  this 
is  discussed  further  later. 
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4.3.1      The  Sequential  Algorithm 

\\'e  carry  out  essentially  the  geometric  algorithm  described  earlier.  First  we  need  to 
define  a  few  terms.  For  a  given  anchor,  an  angle  a  is  legal  for  segment  a  if  the  line  .4 
through  the  anchor,  sustending  angle  a,  has  the  property  that  a  is  e-near  to  .4.  For  a 
given  anchor,  a  segment  is  legal  if  there  is  some  legal  angle  for  that  segment. 

The  new  algorithm  comprises  essentially  Step  3  of  the  previous  geometric  algorithm. 
The  current  anchor  is  swept  around  the  perimeter  of  the  image.  For  the  current  anclior. 
we  keep  a  balanced  tree  containing  an  entering  and  a  leaving  angle  for  each  segment  a;  the 
entering  angle  is  the  smallest  angle  for  which  a  is  legal,  and  the  leaving  angle  is  the  largest 
such  angle.  The  angles  are  stored  in  rotational  order.  As  before,  changes  to  the  ordering 
of  the  angles  occur  at  critical  anchors.  In  order  to  determine  the  next  critical  anchor,  we 
keep  a  heap  of  candidate  future  critical  anchors.  This  heap  initially  stores  the  following 
critical  anchors:  first,  those  critical  anchors  defined  for  each  segment  a  as  follows:  the 
anchors  at  which  a  becomes  and  ceases  to  be  legal.  Second,  each  critical  anchor  defined  by 
neighboring  angles  in  the  balanced  tree  (such  a  critical  anchor  occurs  when  two  entering 
and  or  leaving  angles  interchange  their  relative  order).  In  the  next  paragraph,  we  discuss 
how  the  heap  and  balanced  tree  are  updated  as  the  current  anchor  is  advanced.  Two 
paragraphs  hence,  we  explain  how  the  baJanced  tree  is  used  to  determine  the  boundaries 
of  the  regions  of  robust  cixes. 

The  processing  of  Step  3  is  straightforward.  For  some  initial  anchor,  the  entries  in  the 
balanced  tree  cire  determined  and  the  associated  heap  is  computed.  Next,  repeatedly,  the 
minimum  item  is  removed  from  the  heap  (the  initial  cinchor  is  assumed  to  be  the  smallest 
anchor);  the  item  removed  from  the  heap  is  the  next  critical  anchor  to  be  processed.  If 
it  represents  a  newly  legal  segment  a,  the  entering  and  leaving  angles  associated  with 
a  are  added  to  the  balanced  tree,  ajid  two  new  values  are  added  to  the  heap  (i.e.,  the 
critical  anchors  determined  by  the  new  entering  amd  leaving  angles  and  their  neighbours 
in  the  balanced  tree);  finally,  one  item  is  deleted  from  the  heap  -  the  critical  anchor 
corresponding  to  the  two  no  longer  adjacent  angles.  A  segment  which  ceases  to  be  legal 
is  processed  similarly.  The  other  possibility  is  that  the  critical  anchor  corresponds  to  an 
interchange  in  the  order  of  two  cmgles;  then  their  order  in  the  balanced  tree  is  updated 
and  appropriate  insertions  and  deletions  are  made  to  the  heap  (two  deletions  and  two 
insertions). 

In  order  to  determine  the  regions  of  robust  cixes,  we  need  to  keep  track  of  the  limits  of 
the  anguleu"  regions  which  contain  segments  of  total  length  at  least  /.  In  order  to  do  this, 
we  keep  a  cummulative  sum  as  before.  But  now  for  the  entering  angle  corresponding  to  a 
line  of  length  /'  we  have  an  additive  term  of  +/',  while  for  the  leaving  angle  we  add  a  term 
of  — /'.  The  boundary'  occurs  whenever  the  sum  goes  from  >  /  to  <  /,  or  from  <  /  to  >  /. 
The  image  of  this  boundary  in  the  dual  space  D  is  computed  and  reported  as  before. 

It  remains  to  discuss  how  to  compute  the  leaving  and  entering  angles  for  a  segment 
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a.  A  little  thought  shows  that  the  entering  angle  is  determined  by  the  second  endpoint 
of  a  to  cross  the  entering  boundary  of  the  e-cylinder  (as  the  cylinder  is  rotated  about  the 
current  anchor).  Similarly  the  leaving  angle  is  determined  by  the  first  endpoint  of  o  to 
cross  the  leaving  boundary  of  the  e-cylinder.  Clearly,  if  the  leaving  angle  is  smaller  than 
the  entering  angle,  the  segment  is  not  legal.  In  addition,  two  angles  become  equal  when 
the  line  through  the  associated  endpoints  determines  the  current  anchor;  this  is  computed 
cLS  before. 

This  procedure  requires  0(log|S|)  time  to  process  each  critical  anchor.  In  addition, 
the  time  required  to  initialize  the  heap  and  the  balanced  tree  is  0(|5|log  |5|).  As  before, 
there  are  0(|Sp)  critical  anchors.  We  have  shown: 

Theorem  7  There  is  an  algoTithm  to  determine  the  axes  of  the  e  robust  cylinders  for  a 
set  S  of  segments.  These  axes  are  reported  m  the  form  of  regions  in  the  dual  plane.  The 
algorithm  runs  in  0{\S\^\og\S\)  time. 

In  fact,  a  critical  ajichor  occurs  for  a  pair  of  segments  only  if  there  is  a  line  .4  such  that 
both  segments  axe  e-neax  to  A.  It  is  plausible  to  assume  that  in  practice  the  size  of  this  set 
is  much  smaller  that  \S\.  So  we  define  the  overlap  count  of  segment  a  to  be  the  number 
of  segments  a'  E  S  such  that  there  is  a  line  A  to  which  both  a  and  a'  Eire  e-near.  Suppose 
the  overlap  count  for  each  segment  in  5  is  bounded  by  r.  Then  we  have  shown: 

Corollary  15  There  is  an  algorithm  to  determine  the  axes  of  the  e  robust  cylinders  for  a 
set  S  of  sgements.  These  axes  are  reported  m  the  form  of  regions  in  the  dual  plane.  If  the 
overlap  count  for  each  segment  m  S  is  bounded  by  r,  the  algorithm  runs  m  0{r  ■  |S|log  \S\) 
tim.e. 

Sampling  could  be  applied  here  too,  with  a  segment  being  selected  with  a  probability 
proportional  to  its  length.  But  unless  the  number  of  segments  defining  the  typical  robust 
line  is  large,  scmipling  will  not  be  paxticulaxly  helpful.  We  supsect  that  the  situations  in 
which  Corollary  15  is  useful  will  prove  more  typical. 

Again,  we  could  describe  an  algorithm  with  an  0|S|^)  performance.  However,  as  it  is 
not  clear  that  a  complexity  of  the  type  stated  in  Corollary  15  can  be  achieved,  we  are  not 
exploring  this  further  here. 

Remark  7.  We  comment  here  on  the  definition  of  the  segment-RSL  problem.  First,  in 
practice  it  may  be  appropriate  to  ignore  short  segments  for  it  is  not  clear  that  they  convey 
useful  information  concerning  a  robust  line.  Thus,  for  example,  one  might  exclude  cdl  lines 
of  length  less  that  lOe.  Second,  one  might  wajit  to  require  that  in  addition  to  being  e-neax 
to  a  robust  line,  a  segment  should  have  a  neairby  orientation  also.  So  we  could  put  a  bound 
of  Q  on  the  legal  range  of  orientations.  This  additional  restriction  is  easily  implemented 
in  our  algorithm;  it  simply  requires  a  modification  to  the  entering  and  leaving  angles: 
now  they  are  restricted  to  deviate  no  more  than  q  from  the  orientation  of  the  segment. 
Third,  instead  of  using  the  length  /  of  a  segment  to  measure  its  contribution  to  a  robust 
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line,  it  might  be  desirable  to  use  the  projected  length.  But  again,  this  is  readily  carried 
out:  the  projected  length  will  be  a  term  of  the  form  asin^  +  bcos9,  where  9  denotes  the 
orientation  of  the  candidate  robust  line  and  a  and  b  are  constants  determined  by  the  length 
and  orientation  of  the  segment.  If  the  first  rule  is  implemented,  the  second  rule  will  only 
have  a  modest  effect,  for  the  difference  in  lengths  will  be  less  than  1%. 

4.3.2      The  Parallel  Algorithm 

The  pciraJlel  algorithm  is  very  similar  to  the  previous  parallel  algorithm.  For  each 
segment,  we  compute  the  portions  of  the  boundaries  of  the  regions  of  robust  axes  it  deter- 
mines. Specifically,  for  each  segment  a,  we  compute  a  sequence  of  counts  associated  with 
its  entering  and  leaving  angles,  respectively.  This  is  computed  for  the  range  of  anchors  for 
which  the  segment  is  legal.  We  explain  this  in  more  detail  for  the  entering  angle.  For  each 
segment  a'  we  determine  the  range  of  anchors  such  that  its  entering  and  leaving  angles 
straddle  the  entering  ajigle  of  a  (this  is  determined  by  the  critical  angles  induced  by  a  and 
a').  With  this  range  we  associate  an  increment  /',  equal  to  the  length  of  a'.  In  order  to 
sum  these  increments,  we  associate  an  increment  of  +/'  with  the  start  of  the  range  and  an 
increment  of  — /'  with  the  end  of  the  range.  We  compute  the  prefix  sums  of  the  increments. 
A  change,  in  the  prefix  sum,  as  before,  from  <  /  to  >  /,  or  conversely,  corresponds  to 
the  segment  a  determining  a  portion  of  the  boundary  of  a  region  of  robust  ajces.  This  is 
haindled  as  in  the  previous  parallel  algorithm.  As  in  the  previous  parallel  algorithm,  we 
use  a  sorting  algorithm  and  a  prefix  sums  algorithm,  and  for  each  segment  they  are  being 
applied  to  0(|5|)  items.  We  have  shown: 

Theorem  8  There  is  a  parallel  algorithm  to  determine  the  axes  of  the  e  robust  cylinders 
for  a  set  S  of  segm.ents.  These  axes  are  reported  m  the  form  of  regions  m  the  dual  plane. 
The  algorithm  performs  0(|5|^  log  |S|)  operations  m  0(log|S|)  time. 

We  have  not  been  able  to  find  an  algorithm  which  admits  a  complexity  similar  to  that 
specified  by  Corollary  15.  We  leave  this  as  an  open  problem. 

4.4      Further  Directions 

One  natural  question  is  how  to  choose  e.  One  approach  that  suggests  itself  is  to  use 
2Ln  adaptive  strategy:  For  instance,  to  repeatedly  double  the  value  of  e  until  the  regions 
of  robust  axes  ceeise  to  grow  significantly  (obviously  these  regions  will  grow  slightly  as  e 
increases). 

A  possible  extension  of  this  approach  is  to  adapt  e  to  each  region,  for  one  can  imagine 
that  there  might  be  different  densities  of  points  or  segments  in  different  portions  of  the 
image. 
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